One-rate models outperform two-rate models in site-specific dN/dS estimation
Authors
Abstract
Methods that infer site-specific dN/dS, the ratio of nonsynonymous to synonymous substitution rates, from coding data have been developed primarily to identify positively selected sites (dN/dS > 1). As a consequence, it is largely unknown how well different inference methods can infer dN/dS point estimates at individual sites. In particular, dN/dS may be estimated using either a one-rate approach, where dN/dS is parameterized as a single parameter, or a two-rate approach, in which dN and dS are estimated separately. While some have suggested that the two-rate paradigm may be preferred for positive-selection inference, the relative merits of these two paradigms for site-specific dN/dS estimation remain largely untested. Here, we systematically assess how accurately several popular inference frameworks infer site-specific dN/dS values using alignments simulated within a mutation-selection framework rather than within a dN/dS-based framework. As mutation-selection models describe long-term evolutionary constraints, our simulation approach further allows us to study under what conditions inferred dN/dS captures the underlying equilibrium evolutionary process. We find that one-rate inference models universally outperform two-rate models. Surprisingly, we recover this result even for data simulated with codon bias (i.e., dS varies among sites). Therefore, even when extensive dS variation exists, modeling this variation substantially reduces accuracy. We additionally find that high divergence among sequences matters more than the number of sequences in the alignment for obtaining precise point estimates. We conclude that inference methods which model dN/dS with a single parameter are the preferred choice for estimating reliable site-specific dN/dS ratios.
bioRxiv preprint, first posted online Nov. 24, 2015; doi: http://dx.doi.org/10.1101/032805. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY 4.0 International license.
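The distinction between the one-rate and two-rate parameterizations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation or any inference package's API: the function names and the symbols `omega`, `alpha`, and `beta` are assumptions chosen for illustration, and the sketch only shows how each paradigm scales synonymous and nonsynonymous codon changes under the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def is_synonymous(codon_a, codon_b):
    """Return True if both codons encode the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


def one_rate_multiplier(codon_a, codon_b, omega):
    """One-rate (hypothetical helper): the synonymous rate is fixed to 1,
    and nonsynonymous changes are scaled by a single parameter omega = dN/dS."""
    return 1.0 if is_synonymous(codon_a, codon_b) else omega


def two_rate_multiplier(codon_a, codon_b, alpha, beta):
    """Two-rate (hypothetical helper): synonymous changes are scaled by
    alpha (dS) and nonsynonymous changes by beta (dN), estimated separately."""
    return alpha if is_synonymous(codon_a, codon_b) else beta


def two_rate_dnds(alpha, beta):
    """Under the two-rate paradigm, dN/dS is the ratio of the two estimates."""
    return beta / alpha


# TTT -> TTC is synonymous (both Phe); TTT -> TTA is nonsynonymous (Phe -> Leu).
print(one_rate_multiplier("TTT", "TTA", omega=0.3))
print(two_rate_multiplier("TTT", "TTC", alpha=2.0, beta=0.5))
print(two_rate_dnds(alpha=2.0, beta=0.5))
```

In this sketch the one-rate model carries one free parameter per site, whereas the two-rate model carries two and derives dN/dS as their ratio.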
Related articles
Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates
Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN/dS ratio. For amino-acid sequences, one widely used method, Rate4Site, assigns a relative conservation score to each site in an alignment. How site-wise dN/dS values relate to Rate4Site scores is not known...
A Comparison of One-Rate and Two-Rate Inference Frameworks for Site-Specific dN/dS Estimation.
Two broad paradigms exist for inferring dN/dS, the ratio of nonsynonymous to synonymous substitution rates, from coding sequences: (i) a one-rate approach, where dN/dS is represented with a single parameter, or (ii) a two-rate approach, where dN and dS are estimated separately. The performances of these two approaches have been well s...
Study of Diversity and Estimation of Leaf Area in Different Mint Ecotypes Using Artificial Intelligence and Regression Models under Salinity Stress Conditions
Leaf area is a key indicator for the growth and production of plant products and also determines the efficiency of light consumption. Therefore, the study of diversity and the estimation of leaf area in different mint ecotypes is of particular importance. One of the common methods for estimating leaf area is regression analysis, the leaf area as independent variable, and leaf length and ...
Comparing the performance of GARCH (p,q) models with different methods of estimation for forecasting crude oil market volatility
The use of GARCH models to characterize crude oil price volatility is widely observed in the empirical literature. In this paper, the efficiency of six univariate GARCH models and two methods of estimating the parameters for forecasting oil price volatility are examined, and the best method for forecasting crude oil price volatility in the Brent market is determined. All the examined models in this p...
PREDICTIVE MODELS OF THE DOMINANT PERIOD OF SITE USING ARTIFICIAL NEURAL NETWORK AND MICROTREMOR MEASUREMENTS: APPLICATION TO URMIA, IRAN
Direct drilling and microtremor studies are among the most commonly used methods for estimating the dynamic parameters of a site. One of the most important parameters is the dominant period of the site, whose estimation plays a pivotal role in seismic hazard mitigation. The conventional models obtained are not capable of estimating the parameters that govern the se...